In silico analysis to identify novel ceRNA regulatory axes associated with gallbladder cancer

Competitive endogenous RNA (ceRNA) networks are reported to play a crucial role in regulating cancer-associated genes. Identification of novel ceRNA networks in gallbladder cancer (GBC) may improve the understanding of its pathogenesis and might yield useful leads on potential therapeutic targets for GBC. For this, a literature survey was done to identify differentially expressed lncRNAs (DELs), miRNAs (DEMs), mRNAs (DEGs) and proteins (DEPs) in GBC. Ingenuity pathway analysis (IPA) using DEMs, DEGs and DEPs in GBC identified 242 experimentally observed miRNA-mRNA interactions with 183 miRNA targets, of these 9 (CDX2, MTDH, TAGLN, TOP2A, TSPAN8, EZH2, TAGLN2, LMNB1, and PTMA) were reported at both mRNA and protein levels. Pathway analysis of 183 targets revealed p53 signaling among the top pathway. Protein-protein interaction (PPI) analysis of 183 targets using the STRING database and cytoHubba plug-in of Cytoscape software revealed 5 hub molecules, of which 3 of them (TP53, CCND1 and CTNNB1) were associated with the p53 signaling pathway. Further, using Diana tools and Cytoscape software, novel lncRNA-miRNA-mRNA networks regulating the expression of TP53, CCND1, CTNNB1, CDX2, MTDH, TOP2A, TSPAN8, EZH2, TAGLN2, LMNB1, and PTMA were constructed. These regulatory networks may be experimentally validated in GBC and explored for therapeutic applications.


Introduction
Gallbladder Cancer (GBC) is among the most common malignant tumor of the gastrointestinal system. The only effective treatment is complete surgical resection for this cancer which can be given to only about 10% of patients because this cancer is usually diagnosed at advanced stages, however, recurrence rates after surgical resections are high (Zhu et al., 2010). GBC patients have an overall median survival of 19 months and a 5-year survival rate of 28.8% (Zhu et al., 2020) despite standard treatment, including chemotherapy, radiotherapy and targeted therapy. Therefore, it is necessary to understand the molecular mechanism associated with GBC and identify novel targets for therapeutic applications in GBC.
MicroRNAs (miRNAs) are a class of small non-coding RNAs (ncRNAs) of approximately~22 nucleotides long. Their mRNA targets are based on limited sequence complementarity between the miRNAs seed region (first 2-7 nucleotides from the 5′ end) and regions in the 3′ untranslated region (3' -UTR) of the mRNA. miRNAs play a key role in the negative regulation of target genes and thus change the cellular environment. Nearly 60% of the total known mRNAs are regulated by miRNAs (Friedman et al., 2008). Another group of ncRNA-long non-coding RNAs (lncRNAs), which are >200 nucleotides in length, also affect mRNA stability (Sebastian-delaCruz et al., 2021). LncRNAs containing a 'miRNA response element' can compete with other RNAs and are believed to act as competing endogenous RNAs (ceRNAs). LncRNA-mediated ceRNA regulatory network, namely, lncRNA/miRNA/mRNA axis, is important in promoting tumorigenesis and can potentially serve as a handle to identify key therapeutic targets (Xu et al., 2022).
There are several high-throughput studies available in GBC to identify differentially expressed mRNAs (DEGs) and proteins (DEPs) (Kim et al., 2008;Miller et al., 2009;Huang et al., 2014;Wang et al., 2020a). Knowledge of the non-coding RNAs regulating the expression of these cancer-associated genes/proteins and the corresponding regulatory networks would be important to understand the pathogenesis of GBC and for therapeutic applications. Various groups have analyzed the differential expression of miRNAs (DEMs) in tissue (Letelier et al., 2014;Zhou et al., 2014;Goeppert et al., 2019), serum/plasma extracellular vesicles (Ueta et al., 2021;Yang et al., 2022) using high-throughput studies in GBC. Similarly, other researchers have analyzed differentially expressed lncRNA (DELs) using highthroughput studies (Ma et al., 2016;Wang et al., 2017a;Wu et al., 2017), however, the majority of the studies on lncRNA in GBC are targeted studies. Few of these studies revealed the "lncRNA-miRNA-mRNA network" in GBC in a targeted manner (Wang et al., 2016a;Hu et al., 2019;Yang et al., 2020;Zhang et al., 2020). Another study has constructed a ceRNA network using DELs, DEGs data and predicted miRNAs based on the DEGs from a single GBC dataset (GSE76633) (Kong et al., 2019a). In view of the availability of various targeted and high-throughput studies at lncRNA, miRNA, mRNA, and protein levels in GBC, the construction of a "ceRNA regulatory network" (lncRNA/miRNA/ mRNA axis) based on multiple experimental datasets would be highly relevant.
We aimed to uncover the potential ceRNA regulatory networks regulating cancer-associated processes involved in the development of GBC. The present study applied multi-omics datasets available in GBC, including high-throughput GBC-proteomics data from our lab (Akhtar et al., 2023), and RNA interaction databases (predicted/ experimentally verified) to identify novel lncRNA-miRNA-mRNA regulatory networks in GBC which may be explored for their therapeutic applications in GBC.

Data collection
The literature search was performed and studies with highthroughput expression data in GBC were included for miRNA and mRNA. As the detection of proteins is low in comparison to mRNAs or miRNAs, due to technical limitations, here, both high-throughput studies as well as targeted studies in GBC were used to achieve a comprehensive DEP dataset. The high-throughput proteomic studies include the data from our lab (Akhtar et al., 2023). For lncRNA, only targeted studies were used as their interactions with miRNA are well-annotated.

miRNA-mRNA regulatory axis of gallbladder cancer
Non-redundant lists of DEMs and DEGs were imported into QIAGEN IPA (QIAGEN Inc., https://digitalinsights.qiagen.com/ IPA) (Krämer et al., 2014) and identified the miRNA-mRNA interactions. Similarly, non-redundant lists of DEMs and DEPs were imported into IPA and identified the miRNA-mRNA interactions. miR IDs (for DEMs) and gene symbols (for DEGs and DEPs) were used for IPA analysis.
The expression of lncRNA/miRNA/mRNA/protein was considered "Up" or "Down" based on the expression trend in >50% of the studies. The ones showing an opposite expression in equal no. of studies (50%) were not considered for any further analysis. Similarly, in the case of multiple transcripts of a gene the expression was considered "Up" or "Down" based on the expression trend in >50% of the transcripts.
For IPA analysis, an expression pairing filter was applied to include miRNA-mRNA pairs which are showing an opposite correlation in expressions (miRNA-Up/mRNA-Down; miRNA-Down/mRNA-Up). The confidence level filter was used to include only those interactions which were 'experimentally observed' or 'predicted with high confidence' [the cumulative weighted context score (or "CWCS") as defined by TargetScan is −0.4 or lower]. The datasets from miRNA-mRNA interactions from DEGs and DEPs were integrated and obtained a nonredundant list of miRNA-mRNA interactions. "Experimentally observed" miRNA-mRNA interactions (from the non-redundant list) were selected and miRNA regulatory network was constructed using Cytoscape software v3.9.1 (https://cytoscape.org/) (Shannon et al., 2003).

Functional enrichment analysis
The Search Tool for Retrieval of Interacting Genes database (STRING version 11.5; https://string-db.org) (Szklarczyk et al., 2019) is an online database and tool that can build proteinprotein interaction (PPI) network based on known and predicted interactions. Gene ontology analysis for "experimentally observed" miRNA targets was performed through STRING (cellular components and biological processes) and IPA (molecular and cellular functions and canonical pathways).

Protein-protein interaction analysis
The PPI of the "experimentally observed" miRNA targets were analyzed using STRING 11.5, [Organism: Homo sapiens and PPI Frontiers in Genetics frontiersin.org 02 score was set as 0.9 (highest confidence)]. The network was visualized by cytoscape v3.9.1. Cytohubba, a plugin of cytoscape software, was used to identify the hub genes of the PPI network (Chin et al., 2014). The intersection of the top 10 nodes ranked by degree, closeness, betweenness and bottleneck centrality were considered hub genes.

ceRNA regulatory network construction
DELs (targeted studies) and DEMs (miRNAs associated with hub molecules among the top pathway and the targets reported at both mRNA and protein levels) were used to screen the experimentally validated interaction between them by DIANA-LncBase v3 (https:// diana.e-ce.uth.gr/lncbasev3) (Karagkouni et al., 2020). Both the subunits of miRNA, i.e., "-3p" and "-5p" were considered for finding associated lncRNAs in those cases where the subunits were not specified. Then, lncRNA-miRNA and miRNA-mRNA coexpression pairs (positive relation) were then used to construct ceRNA interaction networks (lncRNA-miRNA-mRNA). The networks were visualized using Cytoscape. Another database, mirTarBase (mirtarbase.cuhk.edu.cn), containing more than three hundred and sixty thousand miRNA-mRNA interactions was then used to categorize the miRNA-mRNA interactions with strong evidence (Reporter assay/Western blot/qPCR) or less strong evidence (Microarray, NGS, pSILAC, CLIP-Seq and others).

FIGURE 1
The overall workflow of the study. DE miRNAs, mRNAs, proteins and lncRNAs in GBC based on a literature survey were used for the study. DE, Differentially expressed; GBC, Gallbladder Cancer; IPA, Ingenuity pathway Analysis; PPI, protein-protein interaction.

Results
The study design and workflow of the study is shown in Figure 1. 'Tissue-based' datasets (DELs, DEMs, DEGs, and DEPs) were used from high-throughput and/or targeted studies and performed IPA analysis for miRNA-mRNA interactions (predicted and experimentally observed) using DEMs and DEGs or DEPs. The "experimentally observed targets" were further used for associated canonical pathways and PPI analysis to identify hub molecules. In addition, miRNA targets DE at both mRNA and protein levels with a positive correlation in expression were also analyzed. Finally, lncRNA-miRNA-mRNA regulatory networks were constructed for the selected targets possibly functional in GBC.

DEM, DEG, DEP, and DEL dataset
Literature search revealed three high-throughput studies on tissue miRNA analysis in GBC (Letelier et al., 2014;Zhou et al., 2014;Goeppert et al., 2019). A total of 382 DEMs (non-redundant Frontiers in Genetics frontiersin.org 04 list) were obtained and shown in Supplementary Table S1. A total of seven high-throughput studies were found analyzing the expression of mRNAs in tissues. A total of 3,135 DEGs (nonredundant list) were obtained and are shown in Supplementary  Table S2. Both high-throughput and targeted studies on the differential expression of proteins in GBC tissues were used for the analysis. Data on DEPs from our lab (Akhtar et al., 2023) was also included. A total of 359 DEPs (non-redundant list) were obtained and are shown in Supplementary Table S3. For lncRNAs, 45 targeted studies representing expressions of 43 different lncRNAs (non-redundant list) were found and is shown in Supplementary Table S4. 3.2 miRNA-mRNA regulatory axis of GBC First, non-redundant lists of DE miRNAs (n = 382) (miRNA IDs) and DE mRNAs (n = 3,135) (gene symbol) were imported into IPA, out of which 372 miRNAs and 2,996 mRNAs were mapped. Inverse expression pairing resulted in a total of 3,906 miRNA and mRNA interactions "Dataset 1" (with high prediction and/or experimentally observed) that includes 278 miRNAs and 1,667 miRNA targets (mRNAs).
Similarly, non-redundant lists of DE miRNAs (n = 382) (miRNA IDs) and DE proteins (n = 359) (gene symbol) were imported into IPA, out of which 372 miRNAs and 356 mRNAs were mapped. Expression pairing resulted in a total of 410 miRNA and mRNA interactions "Dataset 2" (with high prediction and/or experimentally observed) that includes 171 miRNAs and 210 miRNA targets (proteins).
The above two datasets were integrated and obtained a nonredundant list of 4,211 miRNA-mRNA interactions (Supplementary  Table S5). This includes 242 interactions with "experimentally observed targets" (Figure 2) and 3,969 interactions with targets "predicted with high confidence". These 242 interactions include 183 targets (Figure 3, Supplementary Table S6) and were used for gene ontology, pathway, and protein-protein interaction (PPI) network analysis. Out of 242, a total of 11 interactions include 9 targets (CDX2, MTDH, TOP2A, TSPAN8, EZH2, TAGLN2, LMNB1, PTMA, and TAGLN) that are reported to be differentially expressed at both mRNA and protein level in GBC ( Figure 3B; Supplementary Table S7).

Functional enrichment analysis
Gene ontology analysis of 183 proteins through STRING showed that these are localized in intracellular organelle lumen, cytoplasm, membrane-bound organelle, nucleoplasm and chromosome ( Figure 4A). The top biological processes include positive regulation of cellular process, biological process, metabolic process and developmental process ( Figure 4B). Molecular and cellular function and canonical pathways were analyzed through IPA and the threshold criteria considered for the analysis are -log p-value >1.3 or p-value <0.05. The top molecular and cellular functions include cell death and survival, cancer, organismal injury and abnormalities, organismal survival, and cell cycle ( Figure 4C). The top canonical pathways include p53 signaling pathway, pancreatic adenocarcinoma signaling, ovarian carcinoma signaling, Aryl hydrocarbon receptor signaling, cyclins and cell cycle regulation ( Figure 4D; Supplementary Table S8).

Protein-protein interaction analysis
PPI analysis of 183 miRNA targets using STRING was visualized by cytoscape ( Figure 5A). A total of 5 hub molecules (TP53, STAT3, CTNNB1, CDK1, CCND1) were identified based on the intersection of the top 10 nodes ranked by degree, closeness, betweenness and bottleneck centrality ( Figure 5B). Three of the hub molecules (TP53, CCND1, CTNNB1) belong to "p53 signaling pathway" ( Figure 3A). Frontiers in Genetics frontiersin.org 05 3.4 ceRNA regulatory networks ceRNA regulatory networks were constructed for miRNAs associated with 3 hub molecules (TP53, CTNNB1, CCND1) associated with "p53 signaling pathway". Since lncRNAs can bind to miRNA and indirectly regulate the translation of targeted mRNAs, the expression of lncRNAs and mRNAs should be positively correlated (López-Urrutia et al., 2019). The lncRNA-miRNAs-mRNA networks for p53, CCND1, and CTNNB1 is shown in Figure 6. The ceRNA regulatory networks for 8 out of 9 miRNA targets (reported to be DE at both mRNA and protein levels) were also constructed. No lncRNA-miRNA interaction was found for one of the targets, TAGLN. The lncRNA-miRNAs-mRNA networks for CDX2, MTDH, TOP2A, TSPAN8, EZH2, TAGLN2, LMNB1, and PTMA is shown in Figure 7. All the lncRNA-miRNA interactions were reported to be strong interactions except for hsa-miR-23b-3p and associated lncRNAs. The miRNA-mRNA interactions with "strong evidence" as per miRTarBase are shown with thick lines and the ones that are with "less strong evidence" is shown with a dashed line (Figures 6, 7).

Discussion
The molecular mechanism associated with the development and progression of GBC is not clear. An understanding of the ceRNA regulatory networks targeting the "tumor-associated proteins" in GBC would be highly important. In the present study, we integrated Gene ontology analysis of 183 experimentally observed miRNA targets. The top five cellular components (A) biological processes (B) molecular and cellular functions (C) canonical pathways (D) associated with these targets are shown in the figure. The threshold criteria considered for the analysis are -log p-value >1.3 or false discovery rate <0.05. The genes associated with canonical pathways are provided in Supplementary Table S5.
Frontiers in Genetics frontiersin.org 06 the "tissue-based" datasets (lncRNAs, miRNAs, mRNAs and proteins) from high-throughput and/or targeted studies to identify novel ceRNA regulatory networks in GBC. IPA analysis was performed for miRNA-mRNA interactions (predicted and experimentally observed) using DEMs and DEGs or DEPs. We focused on "experimentally observed targets" for associated The intersection of the top 10 nodes ranked by degree, betweenness, closeness and bottleneck. We found 5 hub genes including TP53, STAT3, CTNNB1, CDK1, and CCND1. PPI-protein-protein interaction.
Frontiers in Genetics frontiersin.org 07 canonical pathways and PPI analysis to identify hub molecules. In this study, we included the protein dataset to explore the interactions in which the miRNA target is DE at both mRNA and protein levels with a positive correlation in expression in GBC. Then, lncRNA-miRNA-mRNA regulatory networks were constructed for the selected targets, and the following strategy was used for screening the potential ceRNA regulatory network. First, the miRNA-mRNA interactions with strong evidence were screened as per miRTarBase and a literature survey was done for any report on these interactions in cancer conditions. Then, lncRNA-miRNA pairs were also screened for any report of cancer. Further, using the miRNA-mRNA interactions and lncRNA-miRNA interactions, we propose ceRNA networks (lncRNA-miRNA-mRNA) possibly functional in GBC and may have a potential for therapeutic applications in GBC.
Out of 242, a total of 11 interactions include 9 targets reported to be DE at both mRNA and protein levels in GBC. Out of 9, we found lncRNA-miRNA-mRNA regulatory networks for 8 of them (CDX2, MTDH, TOP2A, TSPAN8, EZH2, TAGLN2, LMNB1, and PTMA) (Figure 7). The same strategy as explained earlier was used for screening the potential ceRNA regulatory networks possibly functional in GBC. We found miRNA-mRNA interaction for a total of 4 genes (EZH2, CDX2, TAGLN2, and PTMA). EZH2 has different roles in cancer, such as oncogenic, tumor suppressor, cancer cell metastasis, cancer immunity, and metabolism. Previous studies have shown its overexpression in different cancers, including prostate cancer, breast cancer, etc. . PTMA is upregulated and associated with the Frontiers in Genetics frontiersin.org 09 development of various cancers, including esophageal squamous cell carcinoma, colorectal, bladder, lung, and liver cancer . The Caudal-type homeobox transcription factor 2 (CDX2) gene is a specific intestinal transcription factor that is involved in the embryonic development and differentiation of the intestine. Its overexpression in gastric carcinoma cells significantly inhibits cell growth and proliferation (Xie et al., 2010). Transgelin 2 (TAGLN2) is known to bind to actin to facilitate the formation of cytoskeletal structures. Downregulation of transgelin 2 is reported to promote breast cancer metastasis .
Overall, we identified twelve ceRNA regulatory networks which might be functional in GBC, however, the experimental validation of these networks (expression analysis) in the clinical samples is the limitation of the present study. In future, in vitro and in vivo studies would be planned that might establish the functional role of these networks in GBC.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

Author contributions
PG, RS were involved in study design; NS, JA were involved in literature search; NS, VC were involved in bioinformatics analysis; PG, SKU, NS, JA, and VC were involved in data analysis, and manuscript writing; RS, SKU had critically reviewed the manuscript; All authors read and approved the final manuscript.

Funding
The work reported here was financially supported by the Indian

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.